10 research outputs found

Computational Methods and Graphical Processing Units for Real-time Control of Tomographic Adaptive Optics on Extremely Large Telescopes.

Author: Dimoudi Sofia
Publication date: 01/01/2015

Ground-based optical telescopes suffer from limited imaging resolution as a result of the effects of atmospheric turbulence on the incoming light. Adaptive optics technology has so far been very successful in correcting these effects, providing nearly diffraction-limited images. Extremely Large Telescopes will require more complex adaptive optics configurations that introduce the need for new mathematical models and optimal solvers. In addition, the amount of data to be processed in real time is greatly increased, making conventional computational methods and hardware inefficient and motivating the study of advanced computational algorithms and their implementation on parallel processors. Graphical Processing Units (GPUs) are massively parallel processors that have so far demonstrated a very high increase in speed compared to CPUs and other devices, and they have a high potential to meet the real-time restrictions of adaptive optics systems. This thesis focuses on the study and evaluation of existing proposed computational algorithms with respect to computational performance, and on their implementation on GPUs. Two basic methods, one direct and one iterative, are implemented and tested. The results presented provide an evaluation of the basic concept upon which other algorithms are based, and demonstrate the benefits of using GPUs for adaptive optics.

Durham e-Theses

Improved Acceleration of the GPU Fourier Domain Acceleration Search Algorithm

Author: Adámek Karel
Armour Wesley
Dimoudi Sofia
Giles Mike
Publication date: 29/11/2017

We present an improvement of our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs) (Dimoudi & Armour 2015; Dimoudi et al. 2017). Our new convolution code, which uses our custom GPU FFT code, is between 2.5 and 3.9 times faster than our cuFFT-based implementation (on an NVIDIA P100) and allows for a wider range of filter sizes than our previous version. By using this new version of our convolution code in FDAS we have achieved a 44% performance increase over our previous best implementation. It is also approximately 8 times faster than the existing PRESTO GPU implementation of FDAS (Luo 2013). This work is part of the AstroAccelerate project (Armour et al. 2002), a many-core accelerated time-domain signal processing library for radio astronomy.

Comment: proceeding from ADASS XXVII conference, 4 pages

arXiv.org e-Print Archive

Oxford University Research Archive

GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory

Author: Adámek Karel
Armour Wesley
Dimoudi Sofia
Giles Mike
Publication venue: Association for Computing Machinery (ACM)
Publication date: 10/04/2020

We present an implementation of the overlap-and-save method, a method for the convolution of very long signals with short response functions, which is tailored to GPUs. We have implemented several FFT algorithms (using the CUDA programming language), which exploit GPU shared memory, allowing for GPU-accelerated convolution. We compare our implementation with an implementation of the overlap-and-save algorithm utilizing the NVIDIA FFT library (cuFFT). We demonstrate that by using a shared-memory-based FFT, we can achieve significant speed-ups for certain problem sizes and lower the memory requirements of the overlap-and-save method on GPUs.

arXiv.org e-Print Archive

Durham Research Online

Oxford University Research Archive
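
Illustrative note on the abstract above: the overlap-and-save method splits a long signal into overlapping segments, convolves each segment with the short response function via an FFT, and discards the aliased samples at the start of each segment. The following is a minimal CPU sketch in Python with NumPy, not the authors' CUDA shared-memory implementation; the function name `overlap_save` and the default `fft_size` are assumptions chosen for illustration.

```python
import numpy as np

def overlap_save(signal, kernel, fft_size=64):
    """Full linear convolution of a long signal with a short kernel.

    Hypothetical helper for illustration; it should match
    np.convolve(signal, kernel) up to floating-point error.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    m = len(kernel)
    if fft_size < m:
        raise ValueError("fft_size must be at least the kernel length")

    step = fft_size - (m - 1)            # valid output samples per segment
    out_len = len(signal) + m - 1        # length of the full convolution
    n_segments = -(-out_len // step)     # ceiling division

    # The kernel spectrum is computed once and reused for every segment.
    kernel_fft = np.fft.rfft(kernel, fft_size)

    # Prepend m-1 zeros (history for the first segment) and pad the tail
    # so that the last segment is complete.
    padded = np.concatenate([
        np.zeros(m - 1),
        signal,
        np.zeros(n_segments * step - len(signal)),
    ])

    out = np.empty(n_segments * step)
    for i in range(n_segments):
        segment = padded[i * step : i * step + fft_size]
        block = np.fft.irfft(np.fft.rfft(segment) * kernel_fft, fft_size)
        # The first m-1 samples are corrupted by circular wrap-around.
        out[i * step : (i + 1) * step] = block[m - 1:]
    return out[:out_len]
```

Each segment is processed independently of the others, which is what makes the method a natural fit for parallel hardware such as GPUs.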

Bioclimatic rehabilitation of an open market place by a computational fluid dynamics simulation assessment

Author: Anna-Maria Tamiolaki
Apostolos Polyzakis
Argyro Dimoudi
Euterpi Deligiorgi
Sofia Dimoudi
Spyros Lyssoudis
Stamatis Zoras
Vasilis Evagelopoulos
Vasilis Stathis
Publication venue: Ubiquity Press, Ltd.
Publication date: 01/01/2017

Crossref

A GPU implementation of the Correlation Technique for Real-time Fourier Domain Pulsar Acceleration Searches

Author: Adamek Karel
Armour Wesley
Dimoudi Sofia
Karastergiou Aris
Ransom Scott M.
Thiagaraj Prabu
Publication venue: American Astronomical Society
Publication date: 15/04/2018

The study of binary pulsars enables tests of general relativity. Orbital motion in binary systems causes the apparent pulsar spin frequency to drift, reducing the sensitivity of periodicity searches. Acceleration searches are methods that account for the effect of orbital acceleration. Existing methods are currently computationally expensive, and the vast amount of data that will be produced by next generation instruments such as the Square Kilometre Array (SKA) necessitates real-time acceleration searches, which in turn requires the use of High Performance Computing (HPC) platforms. We present our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs). The correlation technique is applied as a convolution with multiple Finite Impulse Response filters in the Fourier domain. Two approaches are compared: the first uses the NVIDIA cuFFT library for applying Fast Fourier Transforms (FFTs) on the GPU, and the second contains a custom FFT implementation in GPU shared memory. We find that the FFT shared memory implementation performs between 1.5 and 3.2 times faster than our cuFFT-based application for smaller but sufficient filter sizes. It is also 4 to 6 times faster than the existing GPU and OpenMP implementations of FDAS. This work is part of the AstroAccelerate project, a many-core accelerated time-domain signal processing library for radio astronomy.

Comment: 20 pages, 9 figures. Accepted for publication in ApJ

arXiv.org e-Print Archive

Oxford University Research Archive

GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory

Author: Adámek K.
Dimoudi S.
Dobashi T.
Karel Adámek
Mike Giles
Sofia Dimoudi
Wefers Frank
Wesley Armour
Publication venue: Association for Computing Machinery (ACM)

Crossref

Bits Missing: Finding Exotic Pulsars Using bfloat16 on NVIDIA GPUs

Author: Adámek Karel
Armour Wesley
Dimoudi Sofia
Ransom Scott M.
Roy Jayanta
White Jack
Publication venue: American Astronomical Society
Publication date: 01/01/2023

The Fourier domain acceleration search (FDAS) is an effective technique for detecting faint binary pulsars in large radio astronomy data sets. This paper quantifies the sensitivity impact of reducing numerical precision in the graphics processing unit (GPU)-accelerated FDAS pipeline of the AstroAccelerate (AA) software package. The prior implementation used IEEE-754 single-precision in the entire binary pulsar detection pipeline, spending a large fraction of the runtime computing GPU-accelerated fast Fourier transforms. AA has been modified to use bfloat16 (and IEEE-754 double-precision to provide a “gold standard” comparison) within the Fourier domain convolution section of the FDAS routine. Approximately 20,000 synthetic pulsar filterbank files representing binary pulsars were generated using SIGPROC with a range of physical parameters. They have been processed using bfloat16, single-precision, and double-precision convolutions. All bfloat16 peaks are within 3% of the predicted signal-to-noise ratio of their corresponding single-precision peaks. Of 14,971 “bright” single-precision fundamental peaks above a power of 44.982 (our experimentally measured highest noise value), 14,602 (97.53%) have a peak in the same acceleration and frequency bin in the bfloat16 output plane, while in the remaining 369 the nearest peak is located in the adjacent acceleration bin. There is no bin drift measured between the single- and double-precision results. The bfloat16 version of FDAS achieves a speedup of approximately 1.6× compared to single-precision. A comparison between AA and the PRESTO software package is presented using observations collected with the GMRT of PSR J1544+4937, a 2.16 ms black widow pulsar in a 2.8 hr compact orbit.

Durham Research Online
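
Illustrative note on the abstract above: bfloat16 keeps the sign bit and 8 exponent bits of IEEE-754 single precision but only 7 explicit mantissa bits, so it is effectively the upper 16 bits of a float32. The sketch below emulates that rounding on the CPU with NumPy to show the precision that is given up. It is not how AstroAccelerate performs the conversion (GPU code would normally use native bfloat16 types); the function name `to_bfloat16` is an assumption, and the bit trick is intended for finite values only.

```python
import numpy as np

def to_bfloat16(values):
    """Round float32 values to bfloat16 precision (round-to-nearest-even).

    Hypothetical helper for illustration. The result is returned as
    float32 with the low 16 bits cleared. NaN and infinity are not
    handled specially.
    """
    x = np.atleast_1d(np.asarray(values, dtype=np.float32))
    bits = x.view(np.uint32)
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    bias = ((bits >> 16) & 1) + np.uint32(0x7FFF)
    rounded = (bits + bias) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)

# Relative rounding error for a few sample values (at most 2**-8).
samples = np.array([1.0, 3.14159265, 44.982, 1234.5678], dtype=np.float32)
rel_err = np.abs(to_bfloat16(samples) - samples) / np.abs(samples)
print(rel_err)
```

With 8 bits of significand precision, the relative rounding error of a single value is bounded by 2**-8, roughly 0.4%.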